In [1]:
__author__ = 'Stephanie Juneau <stephanie.juneau@noirlab.edu>, David Herrera <david.herrera@noirlab.edu>, and the Astro Data Lab Team <datalab@noirlab.edu>'
__version__ = '20230424' # yyyymmdd
__datasets__ = ['ls_dr3','sdss_dr13']
__keywords__ = ['extragalactic','galaxies','joint query','spectroscopic redshift','3d plot']

DECaLS and SDSS/BOSS Large Scale Structure¶

by Stéphanie Juneau, David Herrera, and the Astro Data Lab Team

Table of contents¶

  • Goals & Summary
  • Disclaimer & attribution
  • Imports & setup
  • Joint Query of LS and SDSS catalogs
  • Plot Results
  • Exercise
  • 3D plot (RA,DEC and z)

Goals¶

  • Joint query between photometric (LS) and spectroscopic (SDSS) catalogs
  • Plot on-sky position of extragalactic objects, color-coded by redshift
  • Plot, visualize and move around positions of extragalactic objects in 3D

Summary¶

In this Notebook, we explore large-scale structures of galaxies by combining spectroscopic redshifts from SDSS/BOSS with photometry from the DESI pre-imaging Legacy Survey (LS). The advantage of spectroscopic redshifts is that they are far more accurate than photometric redshifts to probe distances to galaxies (though still need to be corrected for possible distortion effects such as the finger-of-God effect, which we ignore here). The advantage of the LS photometry is that it reaches deeper than SDSS by about 1 magnitude, which yields better image quality to measure magnitudes, colors, and galaxy shapes. While there are several possible extensions to the example work included below, we will show that a simple figure of galaxy spatial locations color-coded by galaxy morphological type reveals the known morphology-density relation.

We wanted to extend indeed a little further and be able to visualize and even interact with a representation of these galaxies in the actual space. For that, we developed a 3D plot based directly (as a flat cube, without any projection or correction) on RA, DEC and z, and that is at the end of this NB.

On a technical point of view, this short notebook illustrates an example joint query between the LS DR3 photometry Tractor table, and the SDSS/BOSS DR13 specObj spectroscopy table. It takes advantage of the fact that there is a version of the LS DR3 tractor table that was pre-matched to SDSS/BOSS DR13 so we can join on a common column called specobjid.

The columns from the LS table used (Tractor, pre-matched to specObj DR13) can be seen here: http://datalab.noirlab.edu/query.php?name=ls_dr3.dr3_specobj_dr13

The columns from the SDSS/BOSS DR13 table can be seen here: http://datalab.noirlab.edu/query.php?name=sdss_dr13.specobj

The columns from the SDSS DR17 used for the 3D plot can be found here http://datalab.noirlab.edu/query.php?name=sdss_dr17.specobj

Disclaimer & attribution¶

If you use this notebook for your published science, please acknowledge the following:

  • Data Lab concept paper: Fitzpatrick et al., "The NOAO Data Laboratory: a conceptual overview", SPIE, 9149, 2014, http://dx.doi.org/10.1117/12.2057445

  • Data Lab disclaimer: http://datalab.noirlab.edu/disclaimers.php

Imports and setup¶

Please note that this notebook is written for Python 3.

In [2]:
# std lib
from getpass import getpass

# 3rd party
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from scipy.stats import binned_statistic_2d
%matplotlib inline
from astropy.table import Table
import plotly
import plotly.graph_objs as go
import pandas as pd
plotly.offline.init_notebook_mode()

# Data Lab
from dl import queryClient as qc
from dl import authClient as ac

print('Done importing')
Done importing

Authentication¶

In [3]:
# Uncomment the next 3 lines in case authentication is needed:
#token = ac.login(input("Enter user name: (+ENTER) "),getpass("Enter password: (+ENTER) "))
#if not ac.isValidToken(token):
#    raise Exception('Token is not valid. Please check your usename/password and execute this cell again.')

Query DECaLS Tractor Photometry Catalog¶

The photometry is derived from Tractor modeling of sources, and the database includes model photometry, type (shape), as well as other quantities.

The Legacy Survey DR3 database is called ls_dr3 and includes several tables. We will use the dr3_specobj_dr13 table, which is a version of the LS DR3 tractor table which was pre-matched with specObj table from SDSS/BOSS DR13. The column names and descriptions can be found from the Data Lab Query Interface or using the Table Access Protocol (TAP) service with, e.g., TOPCAT. In both cases, we are interested in ls_dr3.dr3_specobj_dr13.

The SDSS DR13 database is called sdss_dr13 and also includes several tables. We will use the specObj table, which has spectroscopic information.

In [4]:
%%time
# number of rows from LS DR3 tractor (NOTE: tractor is the main photometry table):
query="SELECT nrows FROM tbl_stat WHERE schema='ls_dr4' and tbl_name='tractor'"

# Call query manager
response = qc.query(sql=query, fmt='csv')

print(response)
print('Time')
nrows
187967232

Time
CPU times: user 30.5 ms, sys: 5.61 ms, total: 36.1 ms
Wall time: 137 ms
In [5]:
%%time
# number of rows from SDSS specObj DR13:
query="SELECT nrows FROM tbl_stat WHERE schema='sdss_dr13' and tbl_name='specobj'"

# Call query manager
response = qc.query(sql=query, fmt='csv')

print(response)
print('Time')
nrows
4411200

Time
CPU times: user 25.5 ms, sys: 2.34 ms, total: 27.8 ms
Wall time: 86.4 ms
In [6]:
# ls_dr3.dr3_specobj_dr13           #DECaLS matched to SDSS DR13 specobj
# sdss_dr13.specobj                 #SDSS DR13 specobj

# Write query statement (adql)
query = ("""SELECT 
           L.ra,L.dec,L.type,L.g_r,L.r_z,
           S.z,S.plug_ra,S.plug_dec,S.class, 
           S.vdisp,S.vdisp_err 
         FROM ls_dr3.dr3_specobj_dr13 as L JOIN sdss_dr13.specobj as S 
         ON L.specObjId = S.specobjid 
         WHERE L.ra BETWEEN %s and %s and L.dec BETWEEN %s and %s and (L.ra_ivar > 0) 
         limit 100000""") %(126,131,7.,12.)  #small region

# L.ra, L.dec      = RA, Dec from Legacy Survey (LS) table    
# L.type           = object type (PSF, SIMP, EXP, DEV, COMP)
# L.g_r, L.r_z     = pre-computed g-r and r-z colors from photometry
# S.z              = redshift (z) from SDSS specObj table
# S.plug_ra,dec    = RA, Dec of SDSS fiber from specObj table
# S.class          = Source class (Star, Galaxy, QSO) from SDSS
# S.vdisp,vdisp_err= velocity dispersion (and error) from SDSS specObj table
#
# WHERE: requirement that RA & Dec coordinates are within a rectangular region

print(query)
SELECT 
           L.ra,L.dec,L.type,L.g_r,L.r_z,
           S.z,S.plug_ra,S.plug_dec,S.class, 
           S.vdisp,S.vdisp_err 
         FROM ls_dr3.dr3_specobj_dr13 as L JOIN sdss_dr13.specobj as S 
         ON L.specObjId = S.specobjid 
         WHERE L.ra BETWEEN 126 and 131 and L.dec BETWEEN 7.0 and 12.0 and (L.ra_ivar > 0) 
         limit 100000
In [7]:
%%time
# Call query manager
response = qc.query(adql=query, fmt='csv')

print('Time')
Time
CPU times: user 40.2 ms, sys: 6.04 ms, total: 46.2 ms
Wall time: 838 ms
In [8]:
# Reformat output into a table
result = Table.read(response, format='csv')  #dictionary

result[:10]
Out[8]:
Table length=10
radectypeg_rr_zzplug_raplug_decclassvdispvdisp_err
float64float64str4float64float64float64float64float64str6float64float64
126.00007916690087.034954345152388DEV1.3311860.8108790.212844126.000080000000037.0349361GALAXY284.430515.617004
126.01493851068637.007749700524829DEV1.4159090.8302480.243859126.014957.0077364GALAXY229.5399523.60804
126.026624965454377.04572519907787DEV1.4779150.8415030.259155126.026630000000017.0457033GALAXY221.3285512.791714
126.024017221428857.067081473462699EXP1.0842110.9417470.063915126.024227.0671055GALAXY144.7776310.631522
126.01352914478477.072400035246026COMP0.9749090.7300960.064388126.013567.0723868GALAXY149.0151812.994181
126.003557892108697.10391918975098PSF0.8256570.7449551.289764126.003577.1038999QSO0.00.0
126.08378045088057.062649354011356DEV1.6225260.8757320.31347126.083797.0626316GALAXY273.9194622.274921
126.110287737925337.059580140782136PSF-0.3096080.8021472.227892126.110290000000027.0595677QSO0.00.0
126.13339670601257.0422990856476755PSF0.2792490.089760.00027126.133417.0422755STAR0.00.0
126.138607490182847.043995358472932DEV1.9630741.2092250.509449126.138627.0439339GALAXY349.8328287.22228

Plot Results¶

Sanity check: RA, Dec positions from both tables¶

In [9]:
# convert RA coordinates from [0,360] to [-180,180] 
chgsign = np.where(result['ra'] > 180)
result['ra'][chgsign] = result['ra'][chgsign]-360.
result['plug_ra'][chgsign] = result['plug_ra'][chgsign]-360.

plt.figure(figsize=(9,8))

# plot RA, Dec from LS catalog in red with larger symbols
plt.scatter(result['ra'],result['dec'],s=3,color='red',marker='1')

# overplot RA, Dec from SDSS catalog in blue with smaller symbols
plt.scatter(result['plug_ra'],result['plug_dec'],s=3,color='black',marker='2')

# Extent of RA, Dec (in degrees) to plot
xmin = 126.
xmax = 131.
ymin = 7.
ymax = 12.

plt.axis([xmin, xmax, ymin, ymax])
plt.xlim(reversed(plt.xlim())) # flip the x-axis
plt.xlabel("RA (degrees)", fontsize=20)
plt.ylabel("Dec (degrees)", fontsize=20)
plt.show()

Visual Inspection of Large-Scale Structures¶

Plot the positions of a broad range of redshift, and overplot a thin slice in redshift to show possible structures within that slice.

In [10]:
# Select redshift slice
rz = np.where((result['z'] >0.105) & (result['z']<0.125))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6.5))

# plot all points in red (all redshifts)
ax1.scatter(result['plug_ra'],result['plug_dec'],s=3,color='r',marker='o',alpha=0.25)

# overplot in blue objects in narrow redshift slice
ax1.scatter(result['plug_ra'][rz],result['plug_dec'][rz],s=10,color='black')

# Extent of RA, Dec (in degrees) to plot
xmin = 126.
xmax = 131.
ymin = 7.
ymax = 12.

ax1.axis([xmin, xmax, ymin, ymax])
ax1.set_xlim(reversed(ax1.set_xlim())) # flip the x-axis
ax1.set_xlabel("RA (degrees)", fontsize=20)
ax1.set_ylabel("Dec (degrees)", fontsize=20)

# add rectangle to show where we zoom in next panel
ax1.add_patch(patches.Rectangle((128.65-0.25, 8.85-0.2),0.5,0.4,fill=False,color='b'))


## ZOOM IN A SMALLER REGION

# plot all points in red (all redshifts)
ax2.scatter(result['plug_ra'],result['plug_dec'],s=15,color='r',marker='o',alpha=0.3)

# overplot in blue objects in narrow redshift slice
ax2.scatter(result['plug_ra'][rz],result['plug_dec'][rz],s=30,color='black')

# Extent of RA, Dec (in degrees) to plot
xmin = 128.4
xmax = 128.9
ymin = 8.65
ymax = 9.05

ax2.axis([xmin, xmax, ymin, ymax])
ax2.set_xlim(reversed(ax2.set_xlim())) # flip the x-axis
ax2.set_xlabel("RA (degrees)", fontsize=20)
ax2.set_ylabel("Dec (degrees)", fontsize=20)

# add rectangle to show where we zoom in next panel
ax2.add_patch(patches.Rectangle((128.63, 8.81),0.13,0.107,fill=False,color='b'))

plt.show()

Above, the left-hand panel shows a thin redshift slice (0.105 < z < 0.125, black symbols) among objects with redshifts from the spectroscopic SDSS DR13 sample (red symbols). We can see by eye some large-scale filamentary structures and overdensities. The blue rectangle shows a selected region where we zoom in the right-hand panel. On the latter, we further select a smaller region, which we will use in the next cell below.

In [11]:
## ZOOM IN AGAIN OVER AN EVEN SMALLER REGION
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5.5))

# plot all points in red (all redshifts)
ax1.scatter(result['plug_ra'],result['plug_dec'],s=30,color='r',marker='o',alpha=0.25)

# overplot in black objects in narrow redshift slice
ax1.scatter(result['plug_ra'][rz],result['plug_dec'][rz],s=50,color='black')

# Extent of RA, Dec (in degrees) to plot
xmin = 128.63
xmax = 128.76
ymin = 8.81
ymax = 8.917

ax1.axis([xmin, xmax, ymin, ymax])
ax1.set_xlim(reversed(ax1.set_xlim())) # flip the x-axis
ax1.set_xlabel("RA (degrees)", fontsize=20)
ax1.set_ylabel("Dec (degrees)", fontsize=20)

## SHOW DECaLS IMAGE (screenshot pre-made but could instead implement image cutout)
im = plt.imread('DECaLS_screenshot_zoomIn_labels.jpg')
ax2.imshow(im)
ax2.axis('off')

plt.show()

The left-hand panel shows the small region enclosed in the blue rectangle that we chose above (right-hand panel). The galaxies in black are in the same narrow redshift slice as defined previously (0.105 < z < 0.125). The right-hand panel is an image cutout of the same region of the sky from the LS sky viewer. The galaxies encircled correspond to the points in black, and some or perhaps most of them likely belong to a galaxy cluster.

Large-Scale Structures with LS Morphologies¶

There are many possible extensions to this work. For instance, one could plot again with symbols coded with object type (from LS) and/or class (from SDSS) and/or velocity dispersion (from SDSS) and/or other quantities. Here, we will start with the object "TYPE" from LS, related to the morphological shapes.

The object shape (2D light profile) is modeled by the Tractor (Lang, Hogg & Mykytyn) as part of the procedure to compute model photometry.

Possible shapes for LS DR3:

  • PSF (point spread function: size will vary with the seeing of the observations)
  • SIMP (“simple” galaxies: round, exponential profile with 0.45″ effective radius)
  • EXP (exponential profile; spiral galaxies)
  • DEV (deVaucouleurs profile; elliptical galaxies)
  • COMP (composite deVaucouleurs+exponential at same centroid)

Please note that starting in DR4 and in subsequent data releases, the SIMP model as been replaced with a REX model (Round Exponential).

**Figure:** Images of galaxies including a nearby elliptical galaxy, a nearby spiral galaxy, and a QSO.
In [12]:
# Select redshift slice
rz = np.where((result['z'] >0.105) & (result['z']<0.125))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6.5))

# plot all points in red (all redshifts)
ax1.scatter(result['plug_ra'],result['plug_dec'],s=3,color='r',marker='o',alpha=0.25)

# overplot in black objects in narrow redshift slice
ax1.scatter(result['plug_ra'][rz],result['plug_dec'][rz],s=10,color='black')

# Extent of RA, Dec (in degrees) to plot
xmin = 126.
xmax = 131.
ymin = 7.
ymax = 12.

ax1.axis([xmin, xmax, ymin, ymax])
ax1.set_xlim(reversed(ax1.set_xlim())) # flip the x-axis
ax1.set_xlabel("RA (degrees)", fontsize=20)
ax1.set_ylabel("Dec (degrees)", fontsize=20)


# Select redshift slice and per type
rdev  = np.where((result['type']=='DEV') & (result['z'] >0.105) & (result['z']<0.125))
rexp  = np.where((result['type']=='EXP') & (result['z'] >0.105) & (result['z']<0.125))
rcomp = np.where((result['type']=='COMP') & (result['z'] >0.105) & (result['z']<0.125))

# plot all points in red (all redshifts)
ax2.scatter(result['plug_ra'],result['plug_dec'],s=3,color='gray',marker='o',alpha=0.25)

# overplot in blue objects in narrow redshift slice
ax2.scatter(result['plug_ra'][rexp],result['plug_dec'][rexp],s=15,color='b')         # blue = EXP
ax2.scatter(result['plug_ra'][rdev],result['plug_dec'][rdev],s=15,color='r')         # red  = DEV
ax2.scatter(result['plug_ra'][rcomp],result['plug_dec'][rcomp],s=15,color='orange')  # orange = COMP

plt.axis([xmin, xmax, ymin, ymax])
plt.xlim(reversed(plt.xlim())) # flip the x-axis
plt.xlabel("RA (degrees)", fontsize=20)
plt.ylabel("Dec (degrees)", fontsize=20)
plt.show()

EXERCISE: Large-Scale Structures with Galaxy Colors¶

There are pre-computed colors available. The columns are described here: http://datalab.noirlab.edu/query.php?name=ls_dr3.dr3_specobj_dr13

Another possibility would be to plot again the galaxies spatial coordinates, but color-coded according to their photometric colors. This is left as an exercise for the user, but feel free to get in touch with the Data Lab Team if you have questions.

Interactive 3D plot (RA, DEC and z)¶

We now turn to a different area of the sky. We want to identify filaments and clumps of galaxies in 3D, moving around and 'traveling' through space.

We want to take an interesting section of the sky, where we could observe galaxies clustering and/or filaments of the large scale structure of the universe, as we did in the previous section, except that now we can move through 'space' within those galaxies and see them from different points of view in that section of the universe.

We chose an are far from the galactic plane of the Milky Way

Next, we will get the SDSS data for positions and redshifts, and define our boundaries.

In [13]:
#Create the query to fetch the SDSS data from DataLab:
query = ("""SELECT 
            ra,dec,z
            FROM sdss_dr17.specobj
            WHERE ra BETWEEN %s AND %s AND
                  dec BETWEEN %s AND %s   AND
                  z < %s""")%(175,200,5,29,.4)
print (query)
SELECT 
            ra,dec,z
            FROM sdss_dr17.specobj
            WHERE ra BETWEEN 175 AND 200 AND
                  dec BETWEEN 5 AND 29   AND
                  z < 0.4
In [14]:
%%time
# Fetch the SDSS data from ls_dr17.specobj

selection = qc.query(adql=query, fmt='csv')

print('Time')
Time
CPU times: user 72.7 ms, sys: 26.3 ms, total: 99 ms
Wall time: 1.74 s

Let's check if the table seems to make sense

In [15]:
# Reformat output into a table
data = Table.read(selection, format='csv')  #dictionary

data[:10]
Out[15]:
Table length=10
radecz
float64float64float64
175.013835.00709170.13068146
175.066235.02613220.07325106
175.056555.06559780.15870523
175.059325.14553090.13598974
175.064485.14584980.13935758
175.019095.1898160.13548651
175.131319999999965.0611380.3444923
175.245885.06428080.01940782
175.254665.09000710.13520591
175.254385.1024310.13701823

Plotting in 3D¶

We want to take an interesting section of the sky, where we could observe galaxies clustering and/or filaments of the large scale structure of the universe, as we did in the previous section, EXCEPT, that now we can move through 'space' within those galaxies and see them from different points of view in that section of the universe.

In the following cell, we will get the SDSS data for coordinates and redshift, and define our boundaries:

In [16]:
# Define the coordinate variables for plotting.
RA = data['ra']
DEC = data['dec']
z = data ['z']

ra_ = RA
dec_ = DEC
z_ = z

sel = (ra_ > 180) & (ra_ < 195) & (dec_ > 5) & (dec_ < 20) & (z_ > 0.02) & (z_ < 0.4)
ra_ = ra_[sel]
dec_ = dec_[sel]
z_ = z[sel]

print('Range for RA values: %s    %s' %(np.round(np.amin(ra_),3),np.round(np.amax(ra_))))
print('Range for DEC values: %s    %s' %(np.round(np.amin(dec_),2),np.round(np.amax(dec_),2)))
print('Range for z values: %s    %s' %(np.round(np.amin(z_),2),np.round(np.amax(z_),2)))
Range for RA values: 180.001    195.0
Range for DEC values: 5.0    20.0
Range for z values: 0.02    0.4
In [17]:
print('There are %s points to plot'%len(ra_))
There are 28668 points to plot

Now we prepare the trace to plot

In [18]:
trace = go.Scatter3d(
    x = ra_,
    y = dec_,
    z = z_,
    mode = 'markers',
    marker = {
        'size'      : 1.5,
        'opacity'   : 0.,
        'color'     : z_, 
        'colorscale': 'Turbo'
    }
)

Next, we define camera location and the layout of the plot

In [19]:
camera = dict(
    up = dict(x = 0, y = 1, z = 0),
    center = dict(x = 0, y = 0, z =0),
    eye = dict(x = 0, y = -2.5, z = -2.5)
)

name = 'eye = (x = 0, y = 2.5, z = 0)'
layout = go.Layout(
    scene = dict(
        xaxis = dict(range=[180,195],
                     title = 'RA [degrees]',
                     backgroundcolor = 'black',
                     showbackground=True,
                     gridcolor = "black",
                     ),
        yaxis = dict(range = [5,20],
                     title = 'Dec [degrees]',
                     backgroundcolor = 'black',
                     showbackground=True,
                     gridcolor = "black",
                     ),
        zaxis = dict(range = [0.02,0.4],
                     title = 'z',
                     backgroundcolor = 'black',
                     showbackground=True,
                     gridcolor = "black",
                     ),
    ),
        scene_camera = camera,
        plot_bgcolor = 'black',
        paper_bgcolor = 'black',
        title = None,
        showlegend=False,
        width = 800,
        height = 800,
        autosize = False,
        margin = {'l':0, 'r':0, 'b':0, 't':0}
)
In [20]:
data = [trace]
In [22]:
# Now we finally draw the plot:

print ('')
print ('Feel free to zoom in and zoom out with your mouse or pad')
print ('You can immerse yourself in the plot!! Basically, you can travel through space')
print ('Rotate and fly through the points, so you can notice the filaments and clusters from different points of view')
print ('Hover over different points and a pop up tags will reveal their RA, Dec and z')
print ('')
print ('NOTE: Try to see the z axis from the side and notice there is a \'layer\' effect in z. It is NOT clear why that is')
print ('')

plot_figure = go.Figure(data = data, layout = layout)
        
plot_figure.update_layout(scene_camera=camera, title = name)

plot_figure.update_scenes(
     camera_eye_x = 0.1,
     camera_eye_y = 0.0,
     camera_eye_z = -1.3,
    )
Feel free to zoom in and zoom out with your mouse or pad
You can immerse yourself in the plot!! Basically, you can travel through space
Rotate and fly through the points, so you can notice the filaments and clusters from different points of view
Hover over different points and a pop up tags will reveal their RA, Dec and z

NOTE: Try to see the z axis from the side and notice there is a 'layer' effect in z. It is NOT clear why that is

In [ ]: